GPGPU Processing in CUDA Architecture
Abstract
The future of computation lies with the graphics processing unit (GPU). Given the promise that graphics cards have shown in image processing and accelerated rendering of 3D scenes, and the computational capability they possess, GPUs are developing into powerful parallel computing units. Programming a graphics processor to perform general parallel tasks is quite simple, and once its architectural aspects are understood, it can be used for other demanding tasks as well. In this paper, we show how CUDA can fully utilize the power of these GPUs. CUDA is NVIDIA's parallel computing architecture, which enables dramatic increases in computing performance by harnessing the power of the GPU. The paper describes CUDA and its architecture and compares CUDA C/C++ with other parallel programming languages such as OpenCL and DirectCompute. It also lists common myths about CUDA and discusses why its future looks promising.
Related papers
Spoc: GPGPU Programming through Stream Processing with OCaml
ions Skeletons and Composition : Tomorrow 4:30pm OpenGPU workshop DSL Embedded language to express kernel Real World Use Case 2DRMP : Dimensional R-matrix propagation (Computer Physics Communications) Simulates electron scattering from H-like atoms and ions at intermediate energies Multi-Architecture: MultiCore, GPGPU, Clusters, GPU Clusters Translate from Fortran + Cuda to OCaml+SPOC + Cuda/Op...
Parallel processing for SAR image generation in CUDA – GPGPU platform
High resolution imagery from synthetic aperture radar (SAR) video data requires numerical computations of the order of gigaflops (GFLOP). The computational burden increases with the image size and the amount of input raw video signals. General purpose graphic processor units (GPGPU) can play a pivotal role in parallel processing the raw video data to generate SAR imagery in a much faster proces...
Barra, a Parallel Functional GPGPU Simulator
We present a GPU functional simulator targeting GPGPU based on the UNISIM framework which takes unaltered NVIDIA CUDA executables as input. It simulates the native instruction set of the Tesla architecture at the functional level and generates detailed execution statistics. Simulation speed is competitive with the less-accurate CUDA emulation mode thanks to optimizations which exploit the inher...
Soft GPGPUs for Embedded FPGAs: An Architectural Evaluation
We present a customizable soft architecture which allows for the execution of GPGPU code on an FPGA without the need to recompile the design. Issues related to scaling the overlay architecture to multiple GPGPU multiprocessors are considered along with application-class architectural optimizations. The overlay architecture is optimized for FPGA implementation to support efficient use of embedde...
Performance Comparison of Asynchronous Transfer Configurations for UHD Game Image Compression with GPGPU
Ultra high definition (UHD) game scenes have caused the memory bandwidth problem. The lossless DPCM-GR based compression algorithm [12] using NVIDIA CUDA (Compute Unified Device Architecture) like general purpose GPU (GPGPU) computing relieves the bandwidth problem without sacrificing image quality, which supports bit parallel pipelining. This paper increases the memory bandwidth efficiency usin...
Adaptable and Efficient Variable Size Template Matching in CUDA
Introduction Increasingly flexible GPUs and the advent of GPGPU (General Purpose GPU) languages, such as Nvidia’s CUDA and the OpenCL standard, offer potential peak performance that far exceeds that of general purpose CPUs for a variety of problems. However, architectural and programming restrictions often prevent programmers from achieving peak performance. Even for problems that map well to c...
Journal title:
- CoRR
Volume abs/1202.4347
Pages -
Publication date 2012